Focused word segmentation for ASR

Authors

  • Amarnag Subramanya
  • Jeff A. Bilmes
  • Chia-Ping Chen
Abstract

We propose a new set of features based on the temporal statistics of the spectral entropy of speech, and we show why these features make good inputs for a speech detector. We also propose a back-end that uses the evidence from these features in a ‘focused’ manner. Recognition experiments show that this back-end yields significant performance improvements, whereas merely appending the features to the standard feature vector does not improve performance. We also report a 10% average improvement in word error rate over our baseline for the highly mismatched case in the Aurora 3.0 corpus.
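The abstract does not define the features precisely, so the following is only a minimal sketch of one common reading: the spectral entropy of a frame is the Shannon entropy of its power spectrum normalized to sum to one, and "temporal statistics" are taken here to be a running mean and variance over recent frames. The function names, frame length, hop size, and context length are all assumptions made for illustration and are not taken from the paper.

```python
import cmath
import math
import random


def power_spectrum(frame):
    """Naive DFT power spectrum over the non-negative frequency bins."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(frame, eps=1e-12):
    """Entropy in bits of the power spectrum treated as a distribution."""
    power = power_spectrum(frame)
    total = sum(power) + eps  # eps guards against an all-zero frame
    probs = [p / total for p in power]
    return -sum(p * math.log2(p) for p in probs if p > 0)


def entropy_statistics(signal, frame_len=64, hop=32, context=5):
    """Per-frame entropy with running mean and variance (assumed statistics).

    Returns one (entropy, mean, variance) tuple per frame, where mean and
    variance cover the current frame and up to `context - 1` previous ones.
    """
    entropies = [spectral_entropy(signal[i:i + frame_len])
                 for i in range(0, len(signal) - frame_len + 1, hop)]
    stats = []
    for j, h in enumerate(entropies):
        window = entropies[max(0, j - context + 1):j + 1]
        mean = sum(window) / len(window)
        var = sum((v - mean) ** 2 for v in window) / len(window)
        stats.append((h, mean, var))
    return stats


if __name__ == "__main__":
    # A pure tone concentrates power in one bin; white noise spreads it out.
    tone = [math.sin(2 * math.pi * 8 * t / 64) for t in range(64)]
    rng = random.Random(0)
    noise = [rng.gauss(0, 1) for _ in range(64)]
    print("tone entropy :", spectral_entropy(tone))
    print("noise entropy:", spectral_entropy(noise))
```

In this sketch a signal with a peaked spectrum gives a low entropy and a flat, noise-like spectrum gives a high one, which is the kind of contrast a speech detector could exploit. How the paper's 'focused' back-end consumes such evidence is not described in the abstract and is not modeled here.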

Similar resources

Speech Recognition using Acoustic Landmarks and Binary Phonetic Feature Classifiers

In spite of decades of research, Automatic Speech Recognition (ASR) is far from reaching the goal of performance close to Human Speech Recognition (HSR). One of the reasons for the unsatisfactory performance of state-of-the-art ASR systems, which are based largely on Hidden Markov Models (HMMs), is the inferior acoustic modeling of low-level or phonetic-level linguistic information in the speech...

Morpheme Segmentation and Concatenation Approaches for Uyghur LVCSR

In this paper, various kinds of sub-word lexica are thoroughly investigated within the framework of a Uyghur LVCSR system. Experimental results show that it is inefficient to model directly with word units or with small units such as morphemes or even syllables. It is observed that an optimal sub-word unit set between word and morpheme units is a better fit for the ASR system. In order to select best u...

Morphological filtering of speech spectrograms in the context of additive noise

A recent approach to signal segmentation in additive noise [1, 2] uses features of small spectrogram sub-units accrued over the full spectrogram. The original work considered chirp signals in additive white Gaussian noise. This paper extends this work first by considering similar signals at different signal-to-noise ratios and then in the context of speech recognition. For the chirp case, a cos...

Glottal stop detection in German-accented English using ASR

Glottal stops are sounds produced by the closing and abrupt opening of the vocal folds. In German, glottal stops are frequent before word-initial vowels and are a salient word-linking technique. For this reason, German speakers can transfer this word-linking habit to their English productions. Neither in English nor in German are glottal stops phonemes, i.e. they cannot differentiate word meani...

Towards better language modeling for Thai LVCSR

One of the difficulties of Thai language modeling is the process of text corpus preparation. Because there is no explicit word boundary marker in written Thai text, word segmentation must be performed prior to training a language model. This paper presents two approaches to language model construction for Thai LVCSR based on pseudo-morpheme merging. The first approach merges pseudo-morphemes us...

Publication date: 2005